Abstract

This paper investigates the uniqueness of a nonnegative vector solution and the uniqueness of a
positive semidefinite matrix solution to underdetermined linear systems. A nonnegative vector can be the
unique nonnegative solution to an underdetermined linear system only if the measurement matrix has a
row-span intersecting the positive orthant. Focusing on two types of binary measurement matrices,
Bernoulli 0-1 matrices and adjacency matrices of general expander graphs, we show that, in both cases,
the support size of a unique nonnegative solution can grow linearly, namely as $O(n)$, with the problem
dimension $n$. We also provide closed-form characterizations of the ratio of this support size to the
signal dimension. For the matrix case, we show that under a necessary and sufficient condition on the
linear compressed observations operator, there is a unique positive semidefinite matrix solution to the
compressed linear observations. We further show that a randomly generated Gaussian linear compressed
observations operator satisfies this condition with overwhelmingly high probability.

I. Introduction 

This paper is devoted to recovering a "nonnegative" decision variable from an underdetermined system of
linear equations. When the decision variable is a vector, "nonnegativity" means that each entry is nonnegative.
When the decision variable is a matrix, "nonnegativity" means that the matrix is positive semidefinite.
The problem is ill-posed in general; however, we can correctly recover the vector or the matrix if
the vector is sparse or the matrix is low rank.

Finding the sparsest vector among vectors satisfying a set of linear equations is NP-hard. One frequently
used heuristic is $L_1$ minimization, which returns the vector with the least $L_1$ norm. Recently, there has
been an explosion of research on this topic; see, e.g., [?], [?]-[?], [14]. [17] gives a sufficient condition,
known as the Restricted Isometry Property (RIP), on the measurement matrix that guarantees the recovery of
the sparsest vector via $L_1$ minimization. In many interesting cases, the vector is known to be nonnegative.
[12] gives a necessary and sufficient condition, known as the outwardly neighborliness property of the
measurement matrix, for $L_1$ minimization to successfully recover a sparse nonnegative vector. Moreover,
recent studies [5], [13], [20] suggested that a sparse solution could be the unique nonnegative solution
to the system. This leads to potentially better alternatives to $L_1$ minimization, since in this case any
optimization problem over this constraint set recovers the solution.

Motivated by network inference problems such as network tomography, we are particularly interested
in systems where the measurement matrix is a 0-1 matrix. There are few existing results on
this type of system apart from a few very recent papers [?], [?], [20], [29]. We focus on two types of binary
matrices, Bernoulli 0-1 matrices and adjacency matrices of expanders, and provide conditions under
which a sparse vector is the unique nonnegative solution to the underdetermined system. For random
Bernoulli measurement matrices, we prove that, as long as the number of equations divided by the number
of variables remains constant as the problem dimension grows, with overwhelming probability over the
choice of matrix, a sparse nonnegative vector is the unique nonnegative solution provided that its support
size is at most a fixed positive fraction of its dimension. For general expander matrices, we
further provide a closed-form constant ratio of support size to dimension under which a nonnegative
vector is the unique solution.

The phenomenon that an underdetermined system admits a unique "nonnegative" solution is not
restricted to the vector case. Finding the minimum-rank matrix among all matrices satisfying given
linear equations is a rank minimization problem. Among rank minimization problems, one particularly
important class is the rank minimization problem for positive semidefinite matrices under compressed
observations. For example, minimizing the rank of a covariance matrix, which is a positive semidefinite
matrix, arises in statistics, econometrics, signal processing and many other fields where second-order
statistics for random processes are used [16]. A positive semidefinite matrix is special in that its
eigenvalues (which are also its singular values) are nonnegative. In fact, the nuclear norm minimization
heuristic for general matrices was preceded by the trace norm heuristic for symmetric positive semidefinite
matrices in rank minimization problems. While the general analytic frameworks and computational techniques,
for example [25], [26], are applicable to the rank minimization problems for positive semidefinite matrices,
the special properties of positive semidefinite matrices may open the way to new structures and new
analysis, which more efficient computational techniques may exploit to provide faster matrix recovery.

Parallel to the influence of the nonnegative constraint on a vector variable, the positive semidefinite 
constraint on a matrix variable may dramatically reduce the size of the feasible set in rank minimization 
problems. In particular, we show that under a necessary and sufficient condition for the linear compressed
observations operator, there will be a unique positive semidefinite matrix solution to compressed
linear observations. We further show that a randomly generated Gaussian linear compressed observations 
operator will satisfy this necessary and sufficient condition with overwhelmingly high probability. This 
result is akin to the one in the vector case for the unique nonnegative solution, but the transition from a 
nonnegative vector to a positive semidefinite matrix requires very different analytical approaches. 

This paper is organized as follows. Section II discusses the phenomenon that a sparse vector can be
the unique nonnegative vector satisfying an underdetermined linear system. Focusing on 0-1 matrices,
we prove that a sparse vector is the unique nonnegative solution as long as its support size is at most
a fixed positive fraction of the dimension. We further give a closed-form ratio of the support
size to the dimension if the matrix is the adjacency matrix of an expander graph. Section III shows that a
low-rank matrix can be the unique positive semidefinite matrix satisfying compressed linear measurements.
We provide a necessary and sufficient condition for this phenomenon to happen and prove the existence
of compressed measurements satisfying the proposed condition. Numerical examples are discussed in
Section IV, and Section V concludes the paper.

II. Unique Nonnegative Vector to an Underdetermined System 

How can we recover a vector $x \in \mathbb{R}^n$ from the measurement $y = Ax \in \mathbb{R}^m$, where
$A \in \mathbb{R}^{m \times n}$ ($m < n$) is the measurement matrix? In many applications, $x$ is nonnegative,
which is our main focus here. In general, the task seems impossible as we have fewer measurements than
variables. However, if $x$ is sparse, it can be recovered by solving the following problem,

$$\min \|x\|_0 \quad \text{s.t.} \quad Ax = y,\ x \ge 0, \tag{II.1}$$

where the $L_0$ norm $\|\cdot\|_0$ measures the number of nonzero entries of a given vector. Since (II.1) is
NP-hard in general, people solve an alternative convex problem by replacing the $L_0$ norm with the $L_1$ norm,
where $\|x\|_1 = \sum_i |x_i|$. The $L_1$ minimization problem can be formulated as follows:

$$\min \mathbf{1}^T x \quad \text{s.t.} \quad Ax = y,\ x \ge 0. \tag{II.2}$$



In fact, for a certain class of matrices, if $x$ is sufficiently sparse, not only can we recover $x$ from
(II.2), but $x$ is also the only element of $\{x \mid Ax = y,\ x \ge 0\}$. In other words,
$\{x \mid Ax = y,\ x \ge 0\}$ is a singleton, and $x$ can possibly be recovered by techniques other than
$L_1$ minimization.

[?] analyzed the singleton property of matrices with a row-span intersecting the positive orthant. Here
we first show that only these matrices can possibly have the singleton property.

Definition 1 ([?]). $A$ has a row-span intersecting the positive orthant, denoted by $A \in M^+$, if there
exists a vector $\beta > 0$ in the row space of $A$, i.e. there exists $h$ such that

$$h^T A = \beta^T > 0.$$

There is a simple observation regarding matrices in $M^+$.

Lemma 1. Let $a_i \in \mathbb{R}^m$ ($i = 1, 2, \ldots, n$) be the $i$-th column of matrix $A$. Then
$A \in M^+$ if and only if $0 \notin P$, where

$$P = \mathrm{Conv}(a_1, a_2, \ldots, a_n) = \Big\{ \sum_i \lambda_i a_i \ \Big|\ \mathbf{1}^T \lambda = 1,\ \lambda \ge 0 \Big\}.$$

Proof: If $A \in M^+$, then there exists $h$ such that $h^T A = \beta^T > 0$. Suppose we also have $0 \in P$;
then there exists $\lambda \ge 0$ such that $A\lambda = 0$ and $\mathbf{1}^T \lambda = 1$. Then
$(h^T A)\lambda = \beta^T \lambda > 0$ as $\beta > 0$, $\lambda \ge 0$ and $\lambda \ne 0$. But
$(h^T A)\lambda = h^T (A\lambda) = 0$ as $A\lambda = 0$. Contradiction! Therefore $0 \notin P$.

Conversely, if $0 \notin P$, there exists a separating hyperplane $\{x \mid h^T x + b = 0,\ h \ne 0\}$ that strictly
separates $0$ and $P$. We assume without loss of generality that $h^T 0 + b < 0$ and $h^T x + b > 0$ for any
point $x$ in $P$. Then $h^T a_i > -b > 0$ for all $i$. Thus we conclude $h^T A > 0$.

■

The next theorem states a necessary condition on the matrix $A$ for $\{x \mid Ax = Ax_0,\ x \ge 0\}$ to be a
singleton.

Theorem 1. If $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton for some $x_0 \ge 0$, then $A \in M^+$.

Proof: Suppose $A \notin M^+$. From Lemma 1 we know $0 \in \mathrm{Conv}(a_1, a_2, \ldots, a_n)$. Then there exists
a vector $w \ge 0$ such that $Aw = 0$ and $\mathbf{1}^T w = 1$. Clearly $w \in \mathrm{Null}(A)$ and $w \ne 0$. Then for any
$\gamma > 0$ we have $A(x_0 + \gamma w) = Ax_0 + \gamma Aw = Ax_0$, and $x_0 + \gamma w \ge 0$ provided $x_0 \ge 0$. Hence

$$x_0 + \gamma w \in \{x \mid Ax = Ax_0,\ x \ge 0\},$$

so this set is not a singleton. ■
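
The construction used in this proof can be checked on a toy instance. The sketch below is illustrative only: the matrix `A`, the vectors `w` and `x0`, and the helper `matvec` are hypothetical choices and do not come from the paper. The columns of `A` sum to zero, so $0$ lies in their convex hull and $A \notin M^+$; every point $x_0 + \gamma w$ is then another nonnegative solution with the same measurements.

```python
from fractions import Fraction

def matvec(A, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

# Hypothetical 2 x 3 matrix whose columns (1,0), (-1,1), (0,-1) sum to zero.
A = [[1, -1, 0],
     [0, 1, -1]]

# w >= 0 with A w = 0 and 1^T w = 1, so 0 is in the convex hull of the columns.
w = [Fraction(1, 3)] * 3
x0 = [Fraction(1), Fraction(0), Fraction(0)]  # a 1-sparse nonnegative vector

y = matvec(A, x0)
others = []
for gamma in (1, 2, 5):
    x = [a + gamma * b for a, b in zip(x0, w)]
    # Same measurements, still nonnegative, but different from x0.
    if matvec(A, x) == y and min(x) >= 0 and x != x0:
        others.append(x)

print(len(others))  # number of additional nonnegative solutions found
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality test on the measurements is not affected by rounding.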



Theorem 1 shows that $A \in M^+$ is a necessary condition for an underdetermined system to
admit a unique nonnegative vector solution. If $A \in \mathbb{R}^{m \times n}$ is a random matrix such that every
entry is independently sampled from a Gaussian distribution with zero mean, then the probability that $0$ lies
in the convex hull of the column vectors of $A$, or equivalently that $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is not a
singleton for any $x_0 \ge 0$, is

$$1 - 2^{-n+1} \sum_{k=0}^{m-1} \binom{n-1}{k} \quad \text{([?])},$$

which goes to 1 asymptotically as $n$ increases if $\lim_{n \to +\infty} \frac{m}{n} < \frac{1}{2}$. Thus, if
$\lim_{n \to +\infty} \frac{m}{n} < \frac{1}{2}$, then for a random Gaussian matrix $A$, $\{x \mid Ax = Ax_0,\ x \ge 0\}$
would not be a singleton with overwhelming probability, no matter how sparse $x_0$ is. This phenomenon is also
characterized in [13].
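
To get a feel for how sharply this probability changes around $m/n = 1/2$, the expression can be evaluated exactly. The sketch below assumes the formula reads $1 - 2^{-n+1}\sum_{k=0}^{m-1}\binom{n-1}{k}$; the upper summation limit is reconstructed, and the function name `prob_zero_in_hull` and the test sizes are hypothetical.

```python
from fractions import Fraction
from math import comb

def prob_zero_in_hull(m, n):
    """Probability that 0 lies in the convex hull of n i.i.d. zero-mean
    Gaussian columns in R^m, under the formula assumed in the lead-in."""
    tail = sum(comb(n - 1, k) for k in range(m))  # k = 0, ..., m-1
    return 1 - Fraction(tail, 2 ** (n - 1))

# Ratios m/n below, at, and above one half.
for m, n in [(100, 400), (200, 400), (300, 400)]:
    print(m / n, float(prob_zero_in_hull(m, n)))
```

For small cases the value can be checked by hand: with $m = 1$ and $n = 2$ the origin lies between two scalar samples exactly when they have opposite signs, which happens with probability $1/2$.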

The property that $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton can also be characterized both in terms of
high-dimensional geometry [13] and in terms of the null space property of $A$ [20]. We state two necessary and
sufficient conditions in Theorem 2.

Theorem 2 ([13], [20]). The following three properties of $A \in \mathbb{R}^{m \times n}$ are equivalent:

• For any nonnegative vector $x_0$ with a support size no greater than $k$, the set $\{x \mid Ax = Ax_0,\ x \ge 0\}$
is a singleton.

• The polytope $P$ defined in Lemma 1 has $n$ vertices and is $k$-neighborly.

• For any $w \ne 0$ in the null space of $A$, both the positive support and the negative support of $w$ have
a size of at least $k + 1$.

Note that a polytope $P$ is $k$-neighborly if every set of $k$ vertices spans a face $F$ of $P$. $F$ is a face of
$P$ if there exists a vector $a_F$ such that $a_F^T x = c$ for all $x \in F$, and $a_F^T x < c$ for all $x \notin F$ with $x \in P$.
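
The null space condition can be illustrated on a small instance. The sketch below is a hypothetical example, not taken from the paper: `A` is a $3 \times 4$ Vandermonde-type matrix whose null space is one-dimensional and spanned by `w`, so the support sizes of `w` determine the largest admissible $k$. It also shows how a null vector with a small negative support yields two distinct nonnegative solutions.

```python
def matvec(A, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def support_sizes(w):
    """Sizes of the positive and negative supports of w."""
    pos = sum(1 for v in w if v > 0)
    neg = sum(1 for v in w if v < 0)
    return pos, neg

# Hypothetical matrix: rows are x^0, x^1, x^2 evaluated at x = 0, 1, 2, 3 (rank 3).
A = [[1, 1, 1, 1],
     [0, 1, 2, 3],
     [0, 1, 4, 9]]
w = [-1, 3, -3, 1]          # third finite difference, spans Null(A)

pos, neg = support_sizes(w)
k = min(pos, neg) - 1       # largest k allowed by the null space condition

# A sparser-than-allowed example: x0 lives on the negative support of w,
# and x0 + w is a second nonnegative solution with the same measurements.
x0 = [max(-v, 0) for v in w]
x1 = [a + b for a, b in zip(x0, w)]

print(pos, neg, k)
print(x0, x1, matvec(A, x0), matvec(A, x1))
```

Here both supports of `w` have size 2, so every nonnegative 1-sparse vector is the unique nonnegative solution, while the 2-sparse vector `x0` is not.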

[13] (Corollary 4.1) shows that there exists a special partial Fourier matrix $\Omega$ with $2p + 1$ rows such
that $\{x \mid \Omega x = \Omega x_0,\ x \ge 0\}$ is a singleton for every nonnegative $p$-sparse signal $x_0$. Here we
will show that this result is the "best" we can hope for, in the sense that a matrix $A$ must have at least $2p + 1$
rows if $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton for every nonnegative $p$-sparse signal $x_0$.

Proposition 1. For a matrix $A \in \mathbb{R}^{m \times n}$ ($m < n$), if $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton
for any nonnegative $p$-sparse signal $x_0$, then $m \ge 2p + 1$.

Proof: Pick the first $m + 1$ columns of $A$, denoted by $a_1, a_2, \ldots, a_{m+1} \in \mathbb{R}^m$. Then the equations

$$\sum_{i=1}^{m+1} \lambda_i a_i = 0 \tag{II.3}$$

have $m$ equations and $m + 1$ variables $\lambda_1, \lambda_2, \ldots, \lambda_{m+1}$, and have a non-zero solution.

From Theorem 1 we know that $A$ must belong to $M^+$, i.e. there exists $h$ such that $h^T A = \beta^T > 0$.
Taking the inner product of both sides of (II.3) with $h$, we have

$$\sum_{i=1}^{m+1} \beta_i \lambda_i = 0. \tag{II.4}$$

Since $\beta > 0$, from (II.4) we know $\lambda$ should have both positive and negative terms. Collecting the positive
and negative terms of $\lambda$ separately, we can rewrite (II.3) as follows,

$$\sum_{i \in I_p} \lambda_i a_i = - \sum_{i \in I_n} \lambda_i a_i, \tag{II.5}$$

where $I_p$ is the set of indices of positive terms of $\lambda$ and $I_n$ is the set of indices of negative terms. Note
that $|I_p| + |I_n| \le m + 1$. We also have $\sum_{i \in I_p} \lambda_i = -\sum_{i \in I_n} \lambda_i = r > 0$ from (II.4).

Suppose $m \le 2p$; then $|I_p| + |I_n| \le m + 1 \le 2p + 1$, and we assume without loss of generality that
$|I_p| \le p$. Since $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton for every nonnegative $p$-sparse signal $x_0$,
from Theorem 2, $\mathrm{Conv}(a_1, a_2, \ldots, a_n)$ is $p$-neighborly, which implies that for any index set $I$ with
$|I| = p$, there exists $\eta$ such that $\eta^T a_i = c$ for any $i \in I$, and $\eta^T a_i < c$ for all $i \notin I$. We
consider specifically an index set $I$ which contains $I_p$ but is disjoint from $I_n$, and its corresponding vector
$\eta$. Taking the inner product of both sides of (II.5) with $\eta$, we would get $rc$ on the left and, since
$\eta^T a_i < c$ for every $i \in I_n$, some value strictly less than $rc$ on the right, and reach a contradiction. ■

Sparse recovery problems appear in different fields, and a specific problem setup may impose further
constraints on the measurement matrix. We are particularly interested in network inference problems, in
which the measurement matrix is a 0-1 routing matrix. Network inference problems attempt to extract
individual parameters based on aggregate measurements in networks. There has been active research
in this area, with a wide spectrum of approaches ranging from theoretical reasoning to empirical
measurements [?], [?], [23], [?], [30].

Since the measurement matrices in network inference problems are 0-1 matrices, the instances when
$A$ is a 0-1 matrix are our main focus. Sections II-A and II-B prove that a sparse vector can be the
unique nonnegative vector satisfying compressed linear measurements if the measurement matrix is a
random Bernoulli matrix or an adjacency matrix of an expander graph. Moreover, the support size of
the sparse vector can be proportional to the dimension; in other words, the support size of the unique
nonnegative vector is $O(n)$, where $n$ is the dimension, while the provable support size for the uniqueness
property in [?] is $O(\sqrt{n})$. Besides, for any $\theta = \lim_{n \to +\infty} \frac{m}{n} > 0$, the support size of a
sparse vector that is a unique nonnegative solution can always be $O(n)$, while for Gaussian measurement
matrices, with high probability, $\{x \mid Ax = Ax_0,\ x \ge 0\}$ would not be a singleton for any nonnegative $x_0$
(with linearly growing sparsity) if $\theta < \frac{1}{2}$ [13]. This also shows the fundamental difference between
0-1 measurement matrices and the well-studied Gaussian random measurement matrices.

A. Uniqueness with 0-1 Bernoulli Matrices 

First we consider the uniqueness property with a dense 0-1 Bernoulli matrix. The measurement matrix $A$
is an $(m + 1) \times n$ matrix, with each element in the first $m$ rows of $A$ being an i.i.d. Bernoulli
random variable, taking the value '0' with probability $\frac{1}{2}$ and the value '1' with probability $\frac{1}{2}$.
The last row of $A$ is a $1 \times n$ all-'1' vector. We also assume the ratio $\frac{m}{n}$ is a constant $\theta$ as
the dimension $n$ grows. It turns out that as $n$ goes to infinity, with overwhelming probability there exists a
constant $\gamma > 0$ such that $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton for any nonnegative $(\gamma n - 1)$-sparse
signal $x_0$. To see this, we first present the following theorem:

Theorem 3. For any $\theta > 0$, there exists a constant $\gamma > 0$ such that, with overwhelmingly high probability
as $n \to \infty$, any nonzero vector $w$ in the null space of the measurement matrix $A$ described above has at least
$\gamma n$ negative and at least $\gamma n$ positive elements.

Proof: Let us consider an arbitrary nonzero vector $w$ in the null space of $A$. Let $S$ be the support
set of the negative elements of $w$ and let $S^c$ be the support set of the nonnegative elements of $w$. We
now want to argue that, with overwhelmingly high probability, the cardinality $|S|$ of the set $S$ cannot
be too small.

From the large deviation principle and a simple union bound, for any $\epsilon > 0$, with overwhelmingly
high probability as $n$ goes to infinity, simultaneously for every column of the measurement matrix, the
sum of its $(m + 1)$ elements will be in the range $[\frac{1}{2}\theta(1 - \epsilon)n,\ \frac{1}{2}\theta(1 + \epsilon)n]$.

Since $Aw = 0$,

$$A_S w_S + A_{S^c} w_{S^c} = 0,$$

where $A_S$, $w_S$, $A_{S^c}$, and $w_{S^c}$ are respectively the parts of the matrix $A$ and the vector $w$ indexed
by the sets $S$ and $S^c$.

Multiplying both sides of this equation on the left by the $1 \times (m + 1)$ row vector $[1, 1, \ldots, 1]$, we get

$$U_S w_S + U_{S^c} w_{S^c} = 0, \tag{II.6}$$

where $U_S$ is a $1 \times |S|$ vector, each component of which represents the sum of the elements of the
corresponding column of $A_S$; $U_{S^c}$ is a $1 \times |S^c|$ vector, each component of which represents the sum of
the elements of the corresponding column of $A_{S^c}$.

From the concentration result for the column sums, we know

$$U_S w_S \ge -\tfrac{1}{2}\theta(1 + \epsilon)n \|w_S\|_1,$$

and

$$U_{S^c} w_{S^c} \ge \tfrac{1}{2}\theta(1 - \epsilon)n \|w_{S^c}\|_1.$$

Combining these two inequalities with (II.6), it follows that

$$\tfrac{1}{2}\theta(1 - \epsilon)n \|w_{S^c}\|_1 - \tfrac{1}{2}\theta(1 + \epsilon)n \|w_S\|_1 \le 0,$$

which implies

$$\frac{\|w_S\|_1}{\|w_{S^c}\|_1} \ge \frac{1 - \epsilon}{1 + \epsilon}. \tag{II.7}$$

Now we look at the null space of the measurement matrix $A$. First, notice that the null space of $A$ is
a subset of the null space of the matrix $A'$ obtained by doubling each of the first $\theta n$ rows of $A$ and
subtracting the last row of $A$ (the all-'1' vector). Then the matrix $A'$ is a random $\pm 1$ Bernoulli
measurement matrix, which is known to satisfy the restricted isometry condition. Recall one result about the
null space property of a matrix satisfying the restricted isometry condition:

Lemma 2 ([?]). Let $h$ be any vector in the null space of $A'$ and let $T_0$ be any set of cardinality $q$. Then

$$\|h_{T_0}\|_1 \le \frac{\sqrt{2}\,\delta_{2q}}{1 - \delta_{2q}} \|h_{T_0^c}\|_1,$$

where $\delta_{2q}$ is the restricted isometry constant for sparse vectors with support set size no bigger than $2q$,
namely, $\delta_{2q}$ is the smallest positive number such that for any set $T$ with $|T| \le 2q$, and any vector $y$
supported on $T$, the following holds:

$$\sqrt{m}(1 - \delta_{2q})\|y\|_2 \le \|A' y\|_2 \le \sqrt{m}(1 + \delta_{2q})\|y\|_2.$$

Reasoning from Lemma 2 and (II.7), after some algebra, we know immediately that, for $q = |S|$, $\delta_{2q}$ must
satisfy

$$\delta_{2q} \ge \frac{1 - \epsilon}{1 - \epsilon + \sqrt{2}(1 + \epsilon)}.$$

Fig. 1. The bipartite graph corresponding to matrix $A$ in (II.8).

We also know there exists a $\gamma > 0$ such that for any $q \le \gamma n$, with overwhelmingly high probability
as $n \to \infty$,

$$\delta_{2q} < \frac{1 - \epsilon}{1 - \epsilon + \sqrt{2}(1 + \epsilon)},$$

thus with overwhelmingly high probability as $n \to \infty$, the size of the negative support, namely $|S|$, cannot
be smaller than $\gamma n$.

Similarly, we have the same conclusion for the cardinality of the support set of the positive elements
for any nonzero vector from the null space of the matrix $A$. ■

Theorem 3 immediately indicates that $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton for all nonnegative $x_0$
that are $(\gamma n - 1)$-sparse. Thus the support size of the unique nonnegative vector can be as large as $O(n)$,
while the previous result in [?] is $O(\sqrt{n})$.
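
The measurement ensemble of this subsection is easy to simulate. The sketch below is a minimal illustration under stated assumptions: the sizes `m`, `n`, the seed, and the tolerance `eps` are arbitrary, and the $\pm 1$ matrix `A_prime` is formed as twice the first $m$ rows minus the all-ones row, which is one reading of the construction of $A'$ that makes its entries $\pm 1$ while keeping $\mathrm{Null}(A) \subseteq \mathrm{Null}(A')$.

```python
import random

def bernoulli_matrix(m, n, rng):
    """(m+1) x n matrix: m rows of fair 0/1 coin flips plus a final all-ones row."""
    rows = [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]
    rows.append([1] * n)
    return rows

rng = random.Random(0)        # fixed seed, arbitrary choice
m, n = 400, 800               # theta = m / n = 0.5
A = bernoulli_matrix(m, n, rng)

# Column sums should concentrate around (1/2) * theta * n = m / 2.
col_sums = [sum(A[i][j] for i in range(m + 1)) for j in range(n)]
eps = 0.3
low, high = 0.5 * m * (1 - eps), 0.5 * m * (1 + eps)
inside = all(low <= s <= high for s in col_sums)
print(min(col_sums), max(col_sums), inside)

# Assumed construction of A': 2 * (first m rows) - (all-ones row).
A_prime = [[2 * A[i][j] - A[m][j] for j in range(n)] for i in range(m)]

# Because the last row of A is all ones, A also lies in M^+ (take h = e_{m+1}).
h_times_A = A[m]
```

Since $A'x = 2A_{1:m}x - (\mathbf{1}^T x)\mathbf{1}$, any $x$ with $Ax = 0$ also satisfies $A'x = 0$, which is the inclusion of null spaces used in the proof.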

B. Uniqueness with Expander Adjacency Matrices 

Section II-A discusses the singleton property with 0-1 Bernoulli matrices; here we focus on another
type of 0-1 matrix, where $A$ is the adjacency matrix of a bipartite expander graph. [4], [20], [29] studied
related problems using expander graphs with constant left degree. We instead employ
a general definition of expander which does not require constant left degree.

Every $m \times n$ binary matrix $A$ is the adjacency matrix of an unbalanced bipartite graph with $n$ left
nodes and $m$ right nodes. There is an edge between right node $i$ and left node $j$ if and only if $A_{ij} = 1$.
Let $d_j$ denote the degree of left node $j$, and let $d_l$ and $d_u$ be the minimum and maximum of the left degrees.
Define $\rho = d_l / d_u$; then $0 < \rho \le 1$. For example, the bipartite graph in Fig. 1 corresponds to the matrix
$A$ in (II.8). Here $d_l = 1$, $d_u = 2$, and $\rho = 0.5$.

[Equation (II.8): the binary matrix $A$; only the row fragment "1 1 1 0" survives in the source.]

Definition 2 ([22]). A bipartite graph with $n$ left nodes and $m$ right nodes is an $(\alpha, \delta)$ expander if for any
set $S$ of left nodes of size at most $\alpha n$, the size of the set of its neighbors $\Gamma(S)$ satisfies
$|\Gamma(S)| \ge \delta |E(S)|$, where $E(S)$ is the set of edges connected to nodes in $S$, and $\Gamma(S)$ is the set of
right nodes connected to $S$.
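
For small graphs, the quantities in this definition can be computed by brute force. The sketch below uses a hypothetical $4 \times 5$ binary matrix (it is not the matrix of (II.8)); the function names are illustrative, and the enumeration assumes every left node has degree at least 1. It returns the left degrees, the ratio $\rho = d_l/d_u$, and the smallest value of $|\Gamma(S)|/|E(S)|$ over all left sets of size at most a given bound, which is the largest $\delta$ for which the expansion inequality holds on those sets.

```python
from fractions import Fraction
from itertools import combinations

def left_degrees(A):
    """Degree of each left node, i.e. the column sums of the binary matrix A."""
    return [sum(row[j] for row in A) for j in range(len(A[0]))]

def min_expansion(A, max_size):
    """Minimum of |Gamma(S)| / |E(S)| over nonempty left sets S with |S| <= max_size."""
    m, n = len(A), len(A[0])
    deg = left_degrees(A)
    best = None
    for size in range(1, max_size + 1):
        for S in combinations(range(n), size):
            neighbors = sum(1 for i in range(m) if any(A[i][j] for j in S))
            edges = sum(deg[j] for j in S)   # assumes every degree is >= 1
            ratio = Fraction(neighbors, edges)
            best = ratio if best is None else min(best, ratio)
    return best

# Hypothetical example: rows are right nodes, columns are left nodes.
A = [[1, 0, 0, 1, 0],
     [1, 1, 0, 0, 0],
     [0, 1, 1, 0, 0],
     [0, 0, 1, 0, 1]]

deg = left_degrees(A)
rho = Fraction(min(deg), max(deg))
delta = min_expansion(A, 2)    # sets of size at most alpha * n = 2
print(deg, rho, delta)
```

The enumeration is exponential in the set size, so this is only meant for checking toy instances by hand.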



Our next main result regarding the singleton property of an adjacency matrix of a general expander is
stated as follows.

Theorem 4. For an adjacency matrix $A$ of an $(\alpha, \delta)$ expander with left degrees in the range $[d_l, d_u]$, if
$\delta\rho > \frac{\sqrt{5} - 1}{2} \approx 0.618$, then for any nonnegative $k$-sparse vector $x_0$ with
$k \le \frac{\alpha}{1 + \delta\rho} n$, $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton.

Proof: From Theorem 2, in order to prove that $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton for any
nonnegative $\frac{\alpha}{1 + \delta\rho} n$-sparse vector $x_0$, we only need to argue that for any nonzero $w$ such that
$Aw = 0$, we have $|S_-| \ge \frac{\alpha n}{1 + \delta\rho} + 1$ and $|S_+| \ge \frac{\alpha n}{1 + \delta\rho} + 1$, where
$S_-$ and $S_+$ are the negative support and the positive support of $w$ respectively.

We will prove this by contradiction. Suppose without loss of generality that there exists a nonzero $w$ in
$\mathrm{Null}(A)$ such that $|S_-| = s \le \frac{\alpha n}{1 + \delta\rho}$; then the set $E(S_-)$ of edges connected to
nodes in $S_-$ satisfies

$$d_l s \le |E(S_-)| \le d_u s.$$

Then the set $\Gamma(S_-)$ of neighbors of $S_-$ satisfies

$$d_u s \ge |E(S_-)| \ge |\Gamma(S_-)| \ge \delta |E(S_-)| \ge \delta d_l s,$$

where the second to last inequality comes from the expander property.

Notice that $\Gamma(S_-) = \Gamma(S_+) = \Gamma(S_- \cup S_+)$, otherwise $Aw = 0$ does not hold; then

$$|S_+| \ge \frac{|\Gamma(S_+)|}{d_u} = \frac{|\Gamma(S_-)|}{d_u} \ge \frac{\delta d_l s}{d_u} = \delta\rho s.$$

Now consider the set $S_- \cup S_+$; we have $|S_- \cup S_+| \ge (1 + \delta\rho)s$. Pick an arbitrary subset
$\tilde{S} \subseteq S_- \cup S_+$ such that $|\tilde{S}| = (1 + \delta\rho)s \le \alpha n$. From the expander property, we have

$$|\Gamma(\tilde{S})| \ge \delta |E(\tilde{S})| \ge \delta d_l |\tilde{S}| = \delta\rho(1 + \delta\rho) d_u s > d_u s.$$

The last inequality holds since $\delta\rho(1 + \delta\rho) > 1$ provided $\delta\rho > \frac{\sqrt{5} - 1}{2}$. But
$|\Gamma(\tilde{S})| \le |\Gamma(S_- \cup S_+)| = |\Gamma(S_-)| \le d_u s$. A contradiction arises, which completes the proof. ■
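
The threshold $0.618$ in the last step comes from a quadratic inequality. The short derivation below spells it out with the shorthand $t = \delta\rho$, which is introduced here only for illustration:

```latex
\begin{aligned}
t(1+t) > 1
  &\iff t^2 + t - 1 > 0 \\
  &\iff t > \frac{-1+\sqrt{5}}{2} \approx 0.618
      \quad\text{or}\quad t < \frac{-1-\sqrt{5}}{2}, \\
  &\iff t > \frac{\sqrt{5}-1}{2}
      \quad\text{since } t = \delta\rho > 0 .
\end{aligned}
```

Because $|\Gamma(S)| \le |E(S)|$ forces $\delta \le 1$ and $\rho \le 1$, the admissible values of $\delta\rho$ lie in $(0.618, 1]$, so the sparsity bound $\frac{\alpha}{1+\delta\rho}n$ ranges from $\frac{\alpha}{2}n$ up to, but not including, roughly $0.618\,\alpha n$.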

Corollary 1. For an adjacency matrix $A$ of an $(\alpha, \delta)$ expander with constant left degree $d$, if
$\delta > \frac{\sqrt{5} - 1}{2}$, then for any nonnegative $k$-sparse vector $x_0$ with $k \le \frac{\alpha}{1 + \delta} n$,
$\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton.

Theorem 4 together with Corollary 1 is an extension of existing results. Theorem 3.5 of [20] shows
that for an $(\alpha, \delta)$ expander with constant left degree $d$, if $d\delta > 1$, then there exists a matrix
$\tilde{A}$ (a perturbation of $A$) such that $\{x \mid \tilde{A}x = \tilde{A}x_0,\ x \ge 0\}$ is a singleton for every
nonnegative $\delta\alpha n$-sparse $x_0$. Our result instead directly quantifies the sparsity threshold needed for a
vector to be the unique solution to compressed measurements induced by $A$, not its perturbation. [4] discussed
the success of $L_1$ recovery of a general vector $x$ for expanders with constant left degree. If we apply
Theorem 1 of [4] to cases where $x$ is known to be nonnegative, the result can be interpreted as saying that
$\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton for any nonnegative $\frac{\alpha}{2}n$-sparse vector $x_0$ if
$\delta > \frac{5}{6} \approx 0.833$. Our result in Corollary 1 implies that if
$\delta > \frac{\sqrt{5} - 1}{2} \approx 0.618$, $x_0$ can be $\frac{\alpha}{1 + \delta}n$-sparse and still be the unique
nonnegative solution.

[11], [27] proved that for any $m$, $n$ and $\delta > 0$, there exists an $(\alpha, \delta)$ expander with constant left
degree $d$ for some $d$ and $\alpha > 0$, and such an expander can be generated through random graphs. There also
exist explicit constructions of expander graphs [10]. Combining these results with Corollary 1, for any $m$ and
$n$, we can generate an $(\alpha, \delta)$ expander with adjacency matrix $A$ such that $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a
singleton for any nonnegative $\kappa n$-sparse $x_0$, where $\kappa = \frac{\alpha}{1 + \delta} > 0$. Thus, as with
Bernoulli 0-1 matrices, the adjacency matrix $A$ of an $(\alpha, \delta)$ expander has the property that
$\{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton as long as the support size of $x_0$ is $O(n)$. We further provide an
explicit constant for the ratio of the support size to the dimension. Note that this result is independent of the
ratio $\frac{m}{n}$, while, as discussed earlier, if the matrix has i.i.d. Gaussian entries and
$\lim_{n \to +\infty} \frac{m}{n} < \frac{1}{2}$, $\{x \mid Ax = Ax_0,\ x \ge 0\}$ is not a singleton regardless of the
sparsity of $x_0$.

III. Unique Positive Semidefinite Solution to an Underdetermined System 

A. When is Low-rank Positive Semidefinite Solution the Unique Solution? 

Section II studies the case when a sparse nonnegative vector is the only nonnegative solution to the
system of compressed linear measurements. Here we extend the problem to the matrix space. Let $X$
be an $n \times n$ matrix decision variable. Let $\mathcal{A} : \mathbb{R}^{n \times n} \to \mathbb{R}^m$ be a linear map, and
let $b \in \mathbb{R}^m$. The main optimization problem under study for low-rank matrix recovery is

$$\begin{array}{ll} \text{minimize} & \operatorname{rank}(X) \\ \text{subject to} & \mathcal{A}(X) = b. \end{array} \tag{III.1}$$

In this paper, we are interested in the properties of the feasible set $\{X' \mid \mathcal{A}(X') = b\}$.
Indeed, if there exists an $X'$ such that $\mathcal{A}(X') = b$, then $X'$ plus any matrix in the null space of
$\mathcal{A}$ also satisfies this constraint. However, in applications, one is often interested in recovering a
positive semidefinite symmetric matrix $X$ ($X \succeq 0$ and $X \in S^n$, where $S^n$ is the set of $n \times n$ real
symmetric matrices) from compressed observations. To determine a positive semidefinite symmetric matrix $X$, we
only need to determine the $\frac{n(n+1)}{2}$ unknowns in the upper triangular part of $X$. Thus the linear operator
$\mathcal{A}$ in (III.1) can be reduced to an operator $\mathcal{A}(X^{\perp}) : \mathbb{R}^{n(n+1)/2} \to \mathbb{R}^m$,
where $m < \frac{n(n+1)}{2}$ and $X^{\perp}$ denotes the upper triangular part of the $n \times n$ symmetric matrix $X$.
The null space of $\mathcal{A}$ is a subset of $\mathbb{R}^{n(n+1)/2}$ such that each point from this set, arranged
accordingly as the upper triangular part of an $n \times n$ symmetric matrix $Y$, satisfies
$\mathcal{A}(Y) = 0 \in \mathbb{R}^m$.

Now we ask this question: can we uniquely determine the positive semidefinite symmetric matrix $X$
from $\mathcal{A}(X) = b$, namely, can the feasible set $\{X' \mid \mathcal{A}(X') = b,\ X' \succeq 0,\ X' \in S^n\}$ be a
singleton? The next theorem gives an affirmative answer to this question, and shows that if the linear
measurement operator satisfies a certain condition and the positive semidefinite symmetric matrix $X$ is of low
rank, then the feasible set $\{X' \mid \mathcal{A}(X') = b,\ X' \succeq 0,\ X' \in S^n\}$ is a singleton, namely $X$ is
not only the only low-rank solution, but also the only possible solution.

Theorem 5. Let $X$ be a positive semidefinite symmetric matrix of rank $r$ and let
$\mathcal{A} : \mathbb{R}^{n(n+1)/2} \to \mathbb{R}^m$ be a linear operator which operates on the upper triangular part of
$X$, where $m < \frac{n(n+1)}{2}$. Then $\{X' \mid \mathcal{A}(X') = \mathcal{A}(X),\ X' \succeq 0,\ X' \in S^n\}$ is a
singleton for all $X$ with rank no greater than $r$ if and only if every nonzero symmetric matrix generated from the
null space of $\mathcal{A}$ has at least $r + 1$ negative eigenvalues.

Proof: Sufficiency: we first show that if every nonzero symmetric matrix generated from the
null space of $\mathcal{A}$ has at least $r + 1$ negative eigenvalues, then
$\{X' \mid \mathcal{A}(X') = \mathcal{A}(X),\ X' \succeq 0,\ X' \in S^n\}$ is a singleton. Suppose instead that there
exists an $X'' \in S^n$ with $X'' \succeq 0$ and $X'' \ne X$ such that $\mathcal{A}(X'') = \mathcal{A}(X)$; then the upper
triangular part of $X'' - X$ is in the null space of the linear operator $\mathcal{A}$. By the assumption, we know
that $X'' - X$ has at least $r + 1$ negative eigenvalues. Since $X'' - X$ is a symmetric matrix, its eigenvalues
are real.

For a matrix, we denote these eigenvalues in nondecreasing order, namely,

$$\lambda_1 \le \lambda_2 \le \cdots \le \lambda_{n-1} \le \lambda_n.$$

By a classical variational characterization of eigenvalues [19], if $A$ and $B$ are both $n \times n$ Hermitian
matrices and $B$ has rank at most $r$, then $\lambda_k(A + B) \le \lambda_{k+r}(A)$ for $k = 1, 2, \ldots, n - r$. By
taking $k = 1$, $B = X$ and $A = X'' - X$, we have

$$\lambda_1(X'') = \lambda_1((X'' - X) + X) \le \lambda_{r+1}(X'' - X) < 0,$$

by the eigenvalue assumption for $X'' - X$. But then $X''$ is not a positive semidefinite matrix. This
contradiction shows that $X$ is the only element in the set
$\{X' \mid \mathcal{A}(X') = \mathcal{A}(X),\ X' \succeq 0,\ X' \in S^n\}$.

Necessity: we need to show that if there exists a nonzero symmetric matrix (say $Y$), with its upper
triangular part from the null space of the linear operator $\mathcal{A}$, that has at most $r$ negative eigenvalues,
then we can find an $X$ such that $\{X' \mid \mathcal{A}(X') = \mathcal{A}(X),\ X' \succeq 0,\ X' \in S^n\}$ is not a
singleton. Indeed, since $Y$ is a symmetric matrix, it can be diagonalized by some unitary matrix $U$, namely
$Y = U \Lambda U^{-1}$, where $\Lambda$ is a diagonal matrix with $\Lambda_{i,i} = \lambda_i(Y)$. We then pick
$X = U \Lambda' U^{-1}$, where $\Lambda'$ is a diagonal matrix with $\Lambda'_{i,i} \ge \max\{-\lambda_i, 0\}$ for
$1 \le i \le r$ and $\Lambda'_{i,i} = 0$ for $i > r$. Thus $X$ is a positive semidefinite matrix with rank no larger
than $r$ (note that the eigenvalues of $\Lambda'$ are not necessarily arranged in nondecreasing order with respect
to $i$). Then $X + Y = U \Lambda'' U^{-1}$, where the diagonal entries of the diagonal matrix
$\Lambda'' = \Lambda' + \Lambda$ are all nonnegative. Since $Y$ is not an all-zero matrix, $X + Y$ is an element of the
set $\{X' \mid \mathcal{A}(X') = \mathcal{A}(X),\ X' \succeq 0,\ X' \in S^n\}$ other than $X$.

■
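
The necessity construction can be traced on a $2 \times 2$ toy case. In the sketch below everything is a hypothetical illustration: `Y` plays the role of a nonzero symmetric matrix from the null space with exactly one negative eigenvalue ($r = 1$), `X` is the rank-one positive semidefinite matrix built from the eigenvector of `Y` for that eigenvalue, and the helper `eig_sym2` uses the closed-form eigenvalues of a symmetric $2 \times 2$ matrix.

```python
import math

def eig_sym2(M):
    """Eigenvalues (ascending) of a real symmetric 2 x 2 matrix [[a, b], [b, c]]."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius

# Hypothetical null-space element with eigenvalues -1 and +1.
Y = [[0.0, 1.0], [1.0, 0.0]]
# X = u u^T with u = (1, -1) / sqrt(2), the eigenvector of Y for eigenvalue -1,
# i.e. Lambda' = diag(1, 0) in the notation of the proof.
X = [[0.5, -0.5], [-0.5, 0.5]]
X_plus_Y = [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

print(eig_sym2(Y), eig_sym2(X), eig_sym2(X_plus_Y))
```

Both `X` and `X_plus_Y` have nonnegative eigenvalues, so if `Y` were in the null space of the measurement operator, the two matrices would be distinct positive semidefinite solutions with the same measurements.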

Theorem 5 establishes the necessary and sufficient condition for the uniqueness of a low-rank positive
semidefinite solution under compressed linear measurements. However, checking this condition for a
specific set of linear measurements seems to be a hard problem and, in addition, it is not clear whether
asymptotically there exist linear compressed measurements satisfying the given condition. So in
Section III-B we will investigate whether a set of linear measurements (namely the linear measurement
operator $\mathcal{A}(\cdot)$) sampled from a certain distribution will satisfy this condition.

B. The Null Space Analysis of the Gaussian Ensemble 

We say that the linear operator $\mathcal{A} : \mathbb{R}^{n(n+1)/2} \to \mathbb{R}^m$ is sampled from an independent
Gaussian ensemble if its $i$-th ($1 \le i \le m$) operation, denoted by
$\mathcal{A}_i : \mathbb{R}^{n(n+1)/2} \to \mathbb{R}$, is the inner product

$$\langle X, A_i \rangle = \operatorname{trace}(X^T A_i),$$

where $A_i$ is an $n \times n$ symmetric matrix with independent random elements in its upper triangular part.
On the diagonal of $A_i$, its elements are distributed as real Gaussian random variables $N(0, 1)$ and, in the
off-diagonal part, its elements are distributed as $N(0, \frac{1}{2})$. Across the index $i$, the $A_i$'s are also
sampled independently. One main result of this paper can be stated in the following theorem.
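
A minimal sketch of sampling from this ensemble is given below, assuming off-diagonal variance $\frac{1}{2}$ as read from the definition above. The function names, the sizes `n` and `m`, the seed, and the test matrix `X` are all hypothetical.

```python
import math
import random

def sample_symmetric(n, rng):
    """Symmetric n x n matrix: N(0, 1) on the diagonal, N(0, 1/2) above it."""
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = rng.gauss(0.0, 1.0)
        for j in range(i + 1, n):
            M[i][j] = M[j][i] = rng.gauss(0.0, math.sqrt(0.5))
    return M

def measure(X, ensemble):
    """b_i = <X, A_i> = trace(X^T A_i), computed as a sum of entrywise products."""
    n = len(X)
    return [sum(X[r][c] * Ai[r][c] for r in range(n) for c in range(n))
            for Ai in ensemble]

rng = random.Random(0)        # fixed seed, arbitrary choice
n, m = 4, 6                   # m < n(n+1)/2 = 10 compressed measurements
ensemble = [sample_symmetric(n, rng) for _ in range(m)]

u = [1.0, -1.0, 0.0, 2.0]
X = [[a * b for b in u] for a in u]   # rank-one positive semidefinite matrix u u^T
b = measure(X, ensemble)
print(b)
```

For a rank-one $X = uu^T$ each measurement reduces to the quadratic form $u^T A_i u$, which gives a quick consistency check on the implementation.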

Theorem 6. Consider a linear operator $\mathcal{A} : \mathbb{R}^{n(n+1)/2} \to \mathbb{R}^m$ sampled from an independent
Gaussian ensemble. Let $m = \alpha \times \frac{n(n+1)}{2}$. Then there exists a constant $\alpha < 1$, independent of
$n$, such that with overwhelming probability as $n$ goes to $\infty$, any nonzero symmetric $n \times n$ matrix with its
upper triangular part from the null space of the linear operator $\mathcal{A}$ has at least $\xi n$ negative
eigenvalues, where $\xi > 0$ is a constant that is independent of $n$. Thus with overwhelmingly high probability,
any positive semidefinite matrix $X$ of rank no larger than $\xi n - 1$ will be the only element of the set
$\{X' \mid \mathcal{A}(X') = \mathcal{A}(X),\ X' \succeq 0,\ X' \in S^n\}$.

Note that in Theorem 6, the constant $\xi$ may depend on $\alpha$. Theorem 6 confirms that there indeed exists
a sequence of linear operators such that every nonzero element in their null spaces necessarily generates
a symmetric matrix having a sufficiently large number ($\xi n$) of negative eigenvalues. The "guaranteed"
number of negative eigenvalues is highly nontrivial in the sense that $\xi n$ grows proportionally with $n$ while
the null space of the linear operator $\mathcal{A}$ has dimension at least $(1 - \alpha)\frac{n(n+1)}{2}$, which grows
proportionally with $n^2$. This seems counterintuitive at first sight: a null space of such a large dimension
should have been able to accommodate at least one point which generates a symmetric matrix with very few or even
no negative eigenvalues.

The main difficulty in proving Theorem 6 is to show that for all the nonzero symmetric matrices
generated from the points in the null space of the random linear operator A, the claimed fact holds 
universally with overwhelming probability. This seems to be a daunting job since the null space of every 
linear operator is a continuous object and there are uncountably many symmetric matrices that can be 
generated from it. In fact, we have the following probabilistic characterization with a shortened proof for 
the null space of the linear operator sampled from the independent Gaussian Ensemble. 

Lemma 3. If the linear operator $\mathcal{A}(X) : \mathbb{R}^{\frac{n(n+1)}{2}} \to \mathbb{R}^m$ is sampled from the independent Gaussian ensemble,
then, representing the vectors from the null space of $\mathcal{A}$ by $\frac{n(n+1)}{2} \times 1$ column vectors, the distribution of its
null space is (almost everywhere) equivalent to the distribution of a $(\frac{n(n+1)}{2} - m)$-dimensional subspace
in $\mathbb{R}^{\frac{n(n+1)}{2}}$ whose basis can be represented by a $\frac{n(n+1)}{2} \times (\frac{n(n+1)}{2} - m)$ matrix $Z$ whose elements are
independent Gaussian random variables, $N(0, 1)$ for elements in the rows corresponding to the $n$ diagonal
elements of $X$ and $N(0, \frac{1}{2})$ for elements in the rows corresponding to the $\frac{n(n-1)}{2}$ off-diagonal elements.

Proof: This lemma follows from the fact that a random matrix with zero mean i.i.d. Gaussian
distributed entries generates a random subspace whose distribution is rotationally invariant (namely, the
distribution of that random subspace does not change when it is rotated by a unitary rotation). We also
note that if a random subspace has a rotationally invariant distribution, its null space also has a rotationally
invariant distribution, which again can be generated by a matrix with zero mean i.i.d. Gaussian distributed
entries of appropriate dimensions (with probability 1, the dimension of this null space is $\frac{n(n+1)}{2} - m$).
With a normalization for the variance of the Gaussian distributed entries, we have this lemma. ■

By Lemma 3, the null space of the linear operator $\mathcal{A}$ sampled from the independent Gaussian ensemble
can be represented by

$$\{z \mid z = Zw,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m}\},$$

where $Z$ is a $\frac{n(n+1)}{2} \times (\frac{n(n+1)}{2} - m)$ matrix as mentioned in Lemma 3.

We should first notice that in order to prove the property that "any nonzero symmetric $n \times n$ square
matrix with its upper triangular part from the null space of the linear operator $\mathcal{A}$ has at least $\xi n$ negative
eigenvalues", it suffices to prove that property for the set of symmetric matrices
generated by the set of points

$$\{z \mid z = \tfrac{1}{\sqrt{n}} Zw,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m},\ \|w\|_2 = 1\},$$

in the null space of the linear operator $\mathcal{A}$, since the number of negative eigenvalues of a symmetric matrix
does not change under multiplication by a positive scalar.

Building on this observation, we can proceed to divide the formal proof of Theorem 6 into three steps.
Firstly, since we cannot show our theorem directly for every point in the null space, we first
discretize the sphere

$$\{w \mid \|w\|_2 = 1,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m}\}$$

into a finite $\epsilon$-net consisting of a finite number of points on the sphere such that every point in the set
$\{w \mid \|w\|_2 = 1,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m}\}$ is in the $\epsilon$ (in terms of Euclidean distance) neighborhood of at least one
point from the $\epsilon$-net. Formally, an $\epsilon$-net is a subset $S \subset \{w \mid \|w\|_2 = 1,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m}\}$ such that for
every point $t$ in the set $\{w \mid \|w\|_2 = 1,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m}\}$, one can find $s$ in $S$ such that $\|t - s\|_2 \le \epsilon$. The
following lemma on the size estimate of such an $\epsilon$-net is well known in high dimensional geometry,
for example, see Ell : 



Lemma 4. There is an $\epsilon$-net $S$ of the unit sphere of $\mathbb{R}^{\frac{n(n+1)}{2} - m}$ of cardinality less than $(1 + \frac{2}{\epsilon})^{\frac{n(n+1)}{2} - m}$,
which is no larger than $e^{\frac{n(n+1) - 2m}{\epsilon}}$.
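
To make the two bounds of Lemma 4 concrete, the following minimal Python sketch evaluates their logarithms; the relaxation from the first to the second uses $\log(1 + x) \le x$ with $x = 2/\epsilon$. The function name and the sample values of $n$, $m$ and $\epsilon$ are assumptions chosen for illustration only.

```python
import math


def log_net_size_bounds(n, m, eps):
    """Return (log of (1 + 2/eps)^d, log of e^{(n(n+1) - 2m)/eps}) with d = n(n+1)/2 - m."""
    d = n * (n + 1) / 2 - m                       # dimension of the null space
    log_cardinality = d * math.log(1 + 2 / eps)   # log of the covering bound
    log_relaxed = (n * (n + 1) - 2 * m) / eps     # log of the exponential relaxation
    return log_cardinality, log_relaxed


# Example with hypothetical sizes: n = 40, m = 600, eps = 0.1.
tight, relaxed = log_net_size_bounds(40, 600, 0.1)
```

Both quantities grow like $(1 - \alpha) n^2$ up to constants depending on $\epsilon$, which is the scale that the large deviation exponents must beat in the union bound.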

Secondly, using large deviation techniques or concentration of measure results, we establish the
relevant properties for the symmetric matrices generated from these discrete points on the $\epsilon$-net. For
example, the symmetric matrices have a large number of negative eigenvalues with overwhelming probability.
Thirdly, we show how property guarantees on the $\epsilon$-net can be used to establish the null space
property for the whole null space of the linear operator $\mathcal{A}$. Sections III-C and III-D are then devoted to
completing these steps to prove Theorem 6.

C. Concentration for a Single Point 

We take any point $w$ from the $\epsilon$-net for the set $\{w \mid \|w\|_2 = 1,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m}\}$ and its corresponding
point $z = \frac{1}{\sqrt{n}} Zw$ in the null space of the linear operator $\mathcal{A}$, where $Z$ is the random basis as mentioned
in Lemma 3. Then we argue that the symmetric matrix $G$ with its upper triangular part generated from $z$
has many negative eigenvalues with overwhelming probability. With the i.i.d. Gaussian
probabilistic model for $Z$, the elements of $G$ are independent Gaussian random variables, distributed as
$N(0, \frac{1}{n})$ on the diagonal and as $N(0, \frac{1}{2n})$ on the off-diagonal.

Theorem 7. The smallest $\alpha_1 n$ ($\alpha_1 < \frac{1}{2}$) eigenvalues of the symmetric matrix $G$ with its upper triangular
part generated from $z$ will be upper bounded by $c + \delta$ with overwhelming probability $1 - e^{-c_1 n^2}$, where
$c$ is a negative number as determined from the semicircular law

$$\int_{-\infty}^{c} \frac{1}{\pi}\sqrt{2 - x^2}\; 1_{|x| \le \sqrt{2}}\, dx = \alpha_1,$$

$\delta$ is an arbitrarily small positive number, $c_1$ is a positive constant independent of $n$ and $1$ is the indicator
function.
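
The threshold $c$ in Theorem 7 is a quantile of the semicircle law, so it can be computed numerically. The sketch below assumes the semicircle density $\frac{1}{\pi}\sqrt{2 - x^2}$ on $[-\sqrt{2}, \sqrt{2}]$, which is the limiting law for off-diagonal variance $\frac{1}{2n}$; the radius, the function names and the bisection tolerance are assumptions of this example rather than quantities fixed by the text.

```python
import math

RADIUS = math.sqrt(2.0)  # assumed support radius of the semicircle law


def semicircle_cdf(x, radius=RADIUS):
    """CDF of the semicircle density 2/(pi r^2) * sqrt(r^2 - x^2) on [-r, r]."""
    if x <= -radius:
        return 0.0
    if x >= radius:
        return 1.0
    return (0.5
            + x * math.sqrt(radius ** 2 - x ** 2) / (math.pi * radius ** 2)
            + math.asin(x / radius) / math.pi)


def semicircle_quantile(alpha1, radius=RADIUS, tol=1e-12):
    """Solve semicircle_cdf(c) = alpha1 for c by bisection (the CDF is increasing)."""
    lo, hi = -radius, radius
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid, radius) < alpha1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For any $\alpha_1 < \frac{1}{2}$ the returned value is negative, which is what the later steps need in order to make $c + \delta$ negative for small $\delta$.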

Proof: Indeed, Theorem 7 can be derived from known large deviation or concentration of measure
results for the empirical eigenvalue distribution of random symmetric Gaussian matrices [1], [18]. Obviously,
$G$ has $n$ real eigenvalues $(\lambda_i)_{1 \le i \le n}$ arranged in nondecreasing order and its spectral measure is
$\mu_n = \frac{1}{n}\sum_{i=1}^{n} \delta_{\lambda_i} = \frac{1}{n}\sum_{i=1}^{n} \delta(x - \lambda_i)$, where $\delta(\cdot)$ is the delta function. As in [1], we denote the space of
probability measures on $\mathbb{R}$ by $\mathcal{M}_1^+(\mathbb{R})$ and endow $\mathcal{M}_1^+(\mathbb{R})$ with its usual weak topology. [1] then
gives the following large deviation result for the empirical eigenvalue distribution of the matrix $G$.



Theorem 8 ([1]). Let $\mu \in \mathcal{M}_1^+(\mathbb{R})$, define the rate function

$$I_1(\mu) = \frac{1}{2}\Big(\int x^2\, d\mu(x) - \Sigma(\mu)\Big) - \frac{3}{8} - \frac{1}{4}\log(2),$$

where $\Sigma(\mu)$ is the non commutative entropy

$$\Sigma(\mu) = \int\!\!\int \log(|x - y|)\, d\mu(x)\, d\mu(y).$$

Then

• $I_1$ is well defined over the set $\mathcal{M}_1^+(\mathbb{R})$ and takes its value in $[0, +\infty]$;

- $I_1(\mu)$ is infinite as long as $\mu$ satisfies one of the following:

* $\int x^2\, d\mu(x) = +\infty$

* there exists a subset $A$ of $\mathbb{R}$ with a positive $\mu$ mass but null logarithmic capacity, i.e. a set
$A$ such that $\mu(A) > 0$ and

$$\gamma(A) = \exp\Big\{-\inf_{\nu \in \mathcal{M}_1^+(A)} \int\!\!\int \log\Big(\frac{1}{|x - y|}\Big)\, d\nu(x)\, d\nu(y)\Big\} = 0$$

- $I_1(\mu)$ is a good rate function, namely $\{I_1(\mu) \le M\}$ is a compact subset of $\mathcal{M}_1^+(\mathbb{R})$ for $M \ge 0$.

- $I_1$ is a convex function on $\mathcal{M}_1^+(\mathbb{R})$.

- $I_1$ achieves its minimum value at a unique probability measure on $\mathbb{R}$ which is described by
Wigner's semicircle law.

• The law of the spectral measure $\mu_n = \frac{1}{n}\sum_{i=1}^{n} \delta_{\lambda_i}$ satisfies a full large deviation principle with good
rate function $I_1$ and in the scale $n^2$, that is, for any open subset $O$ of $\mathcal{M}_1^+(\mathbb{R})$,

$$\liminf_{n \to \infty} \frac{1}{n^2}\log(P(\mu_n \in O)) \ge -\inf_{O} I_1,$$

and for any closed subset $F$ of $\mathcal{M}_1^+(\mathbb{R})$,

$$\limsup_{n \to \infty} \frac{1}{n^2}\log(P(\mu_n \in F)) \le -\inf_{F} I_1.$$

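We take $c$ as in the statement of Theorem 7. Then the set $A$ of spectral measures satisfying
the statement of Theorem 7 can be denoted by $\{\frac{1}{n}\sum_{i=1}^{n} 1_{\lambda_i < c + \delta} \ge \alpha_1\}$, whose complement is then
$\{\frac{1}{n}\sum_{i=1}^{n} 1_{\lambda_i < c + \delta} < \alpha_1\}$.

Now we take a continuous function $f$ equal to $1$ over the region $(-\infty, c]$, equal to $0$ on $[c + \delta, +\infty)$,
and linear in between over the region $[c, c + \delta]$. Since $f(x) \le 1_{x < c + \delta}$, the complement of $A$ is included in
the following set of probability measures

$$\Big\{\frac{1}{n}\sum_{i=1}^{n} f(\lambda_i) < \alpha_1\Big\} = \{\mu_n(f) < \alpha_1\} \subset \{\mu(f) \le \alpha_1\} \triangleq B,$$

with $\mu_n = \frac{1}{n}\sum_{i=1}^{n} \delta_{\lambda_i}$ and $\mu(f)$ as the integral of $f$ over $\mu$.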

This set $B$ is closed for the weak topology and so we can apply the large deviation principle as in
[1] to get that

$$\limsup_{n \to \infty} \frac{1}{n^2}\log P\Big(\Big\{\frac{1}{n}\sum_{i=1}^{n} 1_{\lambda_i < c + \delta} < \alpha_1\Big\}\Big) \le -\inf_{B} I_1,$$

with $I_1$ as defined in Theorem 8. From the definition of $\alpha_1$, we know that the semicircle law does
not belong to the set $B$, and so we can conclude that $\inf_{B} I_1 > 0$. This is because the rate function $I_1$ is
a good rate function which achieves its unique minimum at the semicircle law.

■ 

Following Theorem 7, we know that, with overwhelming probability, the symmetric matrix generated
from a single point on the $\epsilon$-net has a large number (proportional to $n$) of negative
eigenvalues. In Section III-D, we will show how to synthesize the results for isolated points so that we
can prove the eigenvalue claim for the whole null space of the linear operator $\mathcal{A}$.

D. Concentration for the Null Space: $\epsilon$-net Analysis

Building on the concentration results for a single point on the $\epsilon$-net, we now begin proving the claims
in Theorem 6 for all the possible symmetric matrices generated from the set

$$\{z \mid z = Zw,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m}\},$$

where $Z$ is a $\frac{n(n+1)}{2} \times (\frac{n(n+1)}{2} - m)$ matrix as mentioned in Lemma 3.

First, we make a simple observation regarding every point $w$ on the Euclidean sphere

$$\{w \mid \|w\|_2 = 1,\ w \in \mathbb{R}^{\frac{n(n+1)}{2} - m}\}.$$

Since $S$ is an $\epsilon$-net on the sphere, we can find a point $w_0 \in S$ with $\|w_0\|_2 = 1$ such that $\|w - w_0\|_2 \le \epsilon$.
For the error term $w - w_0$, we can still find a point $w_1$ on the $\epsilon$-net $S$ such that

$$\big\| w - w_0 - \|w - w_0\|_2\, w_1 \big\|_2 \le \epsilon \|w - w_0\|_2 \le \epsilon^2.$$

By iterating this process, we get that any $w$ on the unit Euclidean sphere can be expressed as

$$w = w_0 + \sum_{i=1}^{\infty} t_i w_i, \tag{III.2}$$

where $|t_i| \le \epsilon^i$ for $i \ge 1$ and $w_i \in S$ for $i \ge 0$.
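
The iteration behind the decomposition (III.2) can be sketched on the unit circle, where an explicit net is easy to write down. This is a toy instance under stated assumptions: the dimension is 2 rather than $\frac{n(n+1)}{2} - m$, the net consists of equally spaced points, and the names `circle_net`, `nearest` and `decompose` are hypothetical helpers introduced for this example.

```python
import math


def circle_net(eps):
    """Equally spaced points on the unit circle forming an eps-net (0 < eps <= 2)."""
    # Any unit vector is within chord length 2*sin(pi/(2N)) of the nearest net point,
    # so N >= pi / (2*asin(eps/2)) guarantees distance at most eps.
    count = math.ceil(math.pi / (2 * math.asin(eps / 2)))
    return [(math.cos(2 * math.pi * j / count), math.sin(2 * math.pi * j / count))
            for j in range(count)]


def nearest(net, v):
    """Net point closest to v in Euclidean distance."""
    return min(net, key=lambda s: (s[0] - v[0]) ** 2 + (s[1] - v[1]) ** 2)


def decompose(w, net, terms=30):
    """Return [(t_0, w_0), (t_1, w_1), ...] with w approximately sum_i t_i * w_i."""
    out = []
    residual = w
    for _ in range(terms):
        t = math.hypot(residual[0], residual[1])
        if t == 0.0:
            break
        # Approximate the normalized residual by a net point, then peel it off.
        s = nearest(net, (residual[0] / t, residual[1] / t))
        out.append((t, s))
        residual = (residual[0] - t * s[0], residual[1] - t * s[1])
    return out
```

For a unit vector $w$, the first coefficient is $t_0 = 1$ and each later coefficient is the norm of the current residual, which shrinks by a factor of at most $\epsilon$ per step.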

Before we proceed further to look at the spectrum of the symmetric matrix $B_w$ generated from $\frac{1}{\sqrt{n}} Zw$,
we state the following theorem by Hoffman and Wielandt [19].



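Theorem 9 ([19]). Let $A, E \in M_n$, assume that $A$ and $A + E$ are both normal, let $\lambda_1, \ldots, \lambda_n$ be the
eigenvalues of $A$ in some given order, and let $\hat{\lambda}_1, \ldots, \hat{\lambda}_n$ be the eigenvalues of $A + E$ in some order. Then
there exists a permutation $\sigma(i)$ of the integers $1, 2, \ldots, n$ such that

$$\Big[\sum_{i=1}^{n} |\hat{\lambda}_{\sigma(i)} - \lambda_i|^2\Big]^{1/2} \le \|E\|_2,$$

where $\|\cdot\|_2$ denotes the Frobenius norm.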
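
A quick numerical check of the Hoffman and Wielandt bound for real symmetric matrices is sketched below. It assumes NumPy and uses the standard fact that, for symmetric (Hermitian) matrices, pairing the eigenvalues of $A$ and $A + E$ in sorted order is one permutation that satisfies the inequality; the matrix sizes, the seed and the helper name are arbitrary choices for this example.

```python
import numpy as np


def hoffman_wielandt_sides(A, E):
    """Return (l2 distance between sorted spectra of A and A + E, Frobenius norm of E)."""
    lam = np.linalg.eigvalsh(A)          # eigenvalues in ascending order
    lam_hat = np.linalg.eigvalsh(A + E)  # eigenvalues in ascending order
    lhs = float(np.sqrt(np.sum((lam_hat - lam) ** 2)))
    rhs = float(np.linalg.norm(E, "fro"))
    return lhs, rhs


rng = np.random.default_rng(1)
n = 8
G = rng.standard_normal((n, n))
H = rng.standard_normal((n, n))
A = (G + G.T) / 2            # symmetric base matrix
E = 0.1 * (H + H.T) / 2      # small symmetric perturbation
lhs, rhs = hoffman_wielandt_sides(A, E)
```

In the argument that follows, $A$ plays the role of $B_{w_0}$ and $E$ the role of the tail $\sum_{i \ge 1} t_i B_{w_i}$.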



Now we can give a closer study of the $n \times n$ symmetric matrix $B_w$ generated from $\frac{1}{\sqrt{n}} Zw$. From the
$\epsilon$-net decomposition (III.2), it follows that

$$B_w = B_{w_0} + \sum_{i=1}^{\infty} t_i B_{w_i},$$

where $B_{w_i}$ is the symmetric matrix generated from $\frac{1}{\sqrt{n}} Zw_i$ for $i \ge 0$.

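Since we can thus view $B_w$ as $B_{w_0}$ plus some perturbation, using Theorem 9, there exists a permutation
$\sigma(i)$ of the integers $1, 2, \ldots, n$ such that

$$\Big[\sum_{i=1}^{n} |\hat{\lambda}_{\sigma(i)} - \lambda_i|^2\Big]^{1/2} \le \Big\|\sum_{i=1}^{\infty} t_i B_{w_i}\Big\|_2, \tag{III.3}$$

where $\lambda_i$, $1 \le i \le n$, and $\hat{\lambda}_i$, $1 \le i \le n$, are the eigenvalues of $B_w$ and $B_{w_0}$ arranged in increasing
order, respectively.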

But from the triangle inequality, we know

$$\Big\|\sum_{i=1}^{\infty} t_i B_{w_i}\Big\|_2 \le \sum_{i=1}^{\infty} |t_i|\, \|B_{w_i}\|_2 \le \sum_{i=1}^{\infty} \epsilon^i C_1 \sqrt{n} \le \frac{\epsilon}{1 - \epsilon}\, C_1 \sqrt{n}, \tag{III.4}$$

where we use the fact (derivations omitted) that with overwhelmingly high probability (the complement
probability exponent in the scale of $-n^2$) as $n \to \infty$, $\|B_w\|_2$ is upper bounded by $C_1\sqrt{n}$ simultaneously
for all $w \in S$, with $C_1$ a constant independent of $n$.

Now we can formally argue that the number, say $k$, of negative eigenvalues of $B_w$ cannot be small.
In particular, we will upper bound $(\alpha_1 n - k)$, where $\alpha_1$ is as defined in Theorem 7 for $B_{w_0}$. By picking
$c$ to be negative and $\delta$ to be small enough in Theorem 7, $c + \delta$ will be negative. Then for whatever
ordering of the eigenvalues of $B_w$, we have

$$\sum_{i=1}^{n} |\hat{\lambda}_{\sigma(i)} - \lambda_i|^2 \ge (\alpha_1 n - k)\,|c + \delta|^2, \tag{III.5}$$

because at least $(\alpha_1 n - k)$ negative eigenvalues (smaller than $c + \delta$) of $B_{w_0}$ will be matched to nonnegative
eigenvalues of $B_w$ in Theorem 9.

Fig. 2. Comparison of $L_1$ recovery and singleton property for (a) 50 × 200 0-1 matrix and (b) 100 × 200 0-1 matrix.

Connecting (III.3), (III.4) and (III.5), we have with overwhelming probability, simultaneously for every
$w$ on the Euclidean sphere, if $(\alpha_1 n - k) > 0$ (otherwise $k$ is already nicely bounded),

$$(\alpha_1 n - k)\,|c + \delta|^2 \le \frac{\epsilon^2 C_1^2 n}{(1 - \epsilon)^2}.$$

So

$$k \ge \alpha_1 n - \frac{\epsilon^2 C_1^2 n}{(1 - \epsilon)^2 |c + \delta|^2},$$

which implies that, if we pick $\epsilon$ small enough, the number of negative eigenvalues of $B_w$ grows proportionally
with $n$. Note that for any $\epsilon > 0$, $c < 0$, $\delta > 0$ and $C_1 > 0$, we can always find a large enough
$\alpha = \frac{2m}{n(n+1)}$ to make sure that the union bound exponent from the cardinality of the $\epsilon$-net is overwhelmed
by both the negative large deviation exponent for the spectral measure and the negative large deviation
exponent for the Frobenius norm of the random matrix. In summary, we have arrived at a complete proof
of Theorem 6.
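
The final lower bound on $k$ is simple enough to evaluate directly. The following short Python sketch does so; the function name and the sample parameter values are assumptions for illustration, since the text does not specify numerical values for $\alpha_1$, $\epsilon$, $C_1$, $c$ or $\delta$.

```python
def negative_eig_lower_bound(n, alpha1, eps, C1, c, delta):
    """Evaluate alpha1*n - eps^2*C1^2*n / ((1 - eps)^2 * |c + delta|^2)."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie in (0, 1)")
    if c + delta >= 0:
        raise ValueError("c + delta must be negative")
    penalty = (eps ** 2) * (C1 ** 2) * n / ((1 - eps) ** 2 * (c + delta) ** 2)
    return alpha1 * n - penalty


# Hypothetical values: the bound stays a positive multiple of n when eps is small.
bound = negative_eig_lower_bound(n=1000, alpha1=0.25, eps=0.05, C1=1.0, c=-0.5, delta=0.1)
```

Because both terms are linear in $n$, the bound is of the form $\xi n$ with $\xi = \alpha_1 - \frac{\epsilon^2 C_1^2}{(1 - \epsilon)^2 |c + \delta|^2}$, which is positive once $\epsilon$ is small enough.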



IV. Simulation 

In the vector case, we generate a random 0-1 matrix $A_{m \times n}$ with i.i.d. entries and empirically study the
uniqueness property and the success of $L_1$ minimization for nonnegative vectors with different sparsity.
Each entry of $A$ takes value 1 with probability 0.2 and value 0 with probability 0.8. The size of $A$ is
50 × 200 and 100 × 200 respectively. For a sparsity $k$, we select a support set $S$ with size $|S| = k$ uniformly
at random, and generate a nonnegative vector $x_0$ on $S$ with i.i.d. entries uniformly distributed on the unit interval.
Then we check whether $U = \{x \mid Ax = Ax_0,\ x \ge 0\}$ is a singleton. This can be realized as follows. We
minimize and maximize the same objective function $d^T x$ over $U$, where $d$ is a random vector in $\mathbb{R}^n$.
Note that if $U$ is not a singleton, then the set $\{d \in \mathbb{R}^n \mid d^T x = d^T x_0,\ \forall x \in U\}$ has measure 0. Thus
the probability that the minimizer and the maximizer are the same when $U$ is not a singleton is 0. We
generate several different $d$'s and claim $U$ to be a singleton if the minimizer and the maximizer are the
same for every $d$. For each instance, we also check whether $L_1$ minimization can recover $x_0$ from $Ax_0$
or not. Under a given sparsity $k$, we generate 200 $x_0$'s and repeat the above procedure 200 times.
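
The singleton test described above can be sketched with an off-the-shelf LP solver. The version below assumes SciPy's `linprog` with the HiGHS backend and compares the minimum and maximum objective values rather than the optimizers themselves; the function name, the number of random directions and the tolerance are choices made for this example and are not taken from the experiments reported here.

```python
import numpy as np
from scipy.optimize import linprog


def is_singleton(A, x0, num_directions=5, tol=1e-7, seed=0):
    """Heuristically test whether {x : Ax = A x0, x >= 0} contains only x0."""
    rng = np.random.default_rng(seed)
    b = A @ x0
    for _ in range(num_directions):
        d = rng.standard_normal(A.shape[1])
        low = linprog(d, A_eq=A, b_eq=b, bounds=(0, None), method="highs")
        high = linprog(-d, A_eq=A, b_eq=b, bounds=(0, None), method="highs")
        if low.status == 3 or high.status == 3:
            return False               # unbounded objective: the set is not a single point
        if low.status != 0 or high.status != 0:
            raise RuntimeError("LP solver did not return an optimal solution")
        if (-high.fun) - low.fun > tol:
            return False               # max of d^T x strictly exceeds its min
    return True
```

If the feasible set is not a singleton, a random direction $d$ separates the minimum and maximum with probability 1, so a small number of directions is typically enough in practice.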

We fix $n = 200$, and $m$ is 50 in Fig. 2(a) and 100 in Fig. 2(b). When $\frac{m}{n}$ increases from $\frac{1}{4}$ to $\frac{1}{2}$, the
support size of a sparse vector which is a unique nonnegative solution increases from $0.05n$ to $0.19n$.
Note that when $\frac{m}{n} = \frac{1}{4}$, for this 0-1 matrix, the singleton property still exists linearly in $n$, while for a
random Gaussian matrix, with overwhelming probability no vector can be a unique nonnegative solution.
Besides, the thresholds where the singleton property breaks down and where the full recovery by $L_1$
minimization breaks down are quite close.

In the matrix case, we generate a 40 × 40 matrix $G$ such that all the elements are i.i.d. $N(0, 1)$; then
$A = \frac{1}{2}(G + G^T)$ has its diagonal elements distributed as $N(0, 1)$ and off-diagonal elements distributed as
$N(0, \frac{1}{2})$. We generate $m$ such matrices $A_i$ as the linear operator $\mathcal{A}$; $m$ is 500 and 600 respectively for
comparison. X is a low-rank positive semidefinite symmetric matrix. We increase the rank of X from 



to $0.4n$, and for each fixed rank, generate 200 $X$'s randomly. For each $X$, we minimize and maximize
the same objective function $\langle D, X' \rangle$ over the set $V = \{X' \mid \mathcal{A}(X') = \mathcal{A}(X),\ X' \succeq 0,\ X' \in S^n\}$, where
$D$ is a random matrix with i.i.d. $N(0, 1)$ entries. Similarly to the vector case, if $V$ is not a singleton,
then the set $\{D \mid \langle D, X' \rangle = \langle D, X \rangle,\ \forall X' \in V\}$ has measure 0. Thus the probability that the minimizer
and the maximizer are the same when $V$ is not a singleton is 0. We generate several different $D$'s
and claim the set $V$ to be a singleton if the minimizer and the maximizer of $\langle D, X' \rangle$ over $V$
are the same for every $D$. As indicated by Fig. 3, when
$m = 500$, the singleton property holds if $\mathrm{rank}(X)$ is at most 2, which is $0.05n$. When $m$ increases to
600, the singleton property holds if $\mathrm{rank}(X)$ is at most 8, which is $0.2n$.
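
The measurement setup of the matrix experiment can be sketched as follows. This covers only the generation of the operator and the observations, not the semidefinite programs; the helper names are hypothetical, and drawing $X = FF^T$ with a Gaussian factor $F$ is an assumption, since the text only says that the $X$'s are generated randomly.

```python
import numpy as np


def sample_operator(n, m, rng):
    """Stack of m symmetric matrices A_i = (G + G^T)/2 with G having i.i.d. N(0, 1) entries."""
    G = rng.standard_normal((m, n, n))
    return (G + np.transpose(G, (0, 2, 1))) / 2


def random_psd(n, rank, rng):
    """Random positive semidefinite matrix of the given rank (assumed construction F F^T)."""
    F = rng.standard_normal((n, rank))
    return F @ F.T


def measure(ops, X):
    """Linear observations y_i = <A_i, X> for every A_i in the stack."""
    return np.einsum("kij,ij->k", ops, X)


rng = np.random.default_rng(0)
ops = sample_operator(40, 500, rng)
X = random_psd(40, 2, rng)
y = measure(ops, X)
```

The vector `y` together with `ops` defines the feasible set $V$; checking whether $V$ is a singleton then requires a semidefinite programming solver, which is outside the scope of this sketch.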

V. Conclusion 

This paper studies the phenomenon that an underdetermined system admits a unique nonnegative vector
solution or a unique positive semidefinite matrix solution. This uniqueness property can potentially lead
to more efficient sparse recovery algorithms. We show that $\{x \mid Ax = Ax_0,\ x \ge 0\}$ can be a singleton
for a sufficiently sparse $x_0$ only if the row span of $A$ intersects the positive orthant.
Among these matrices, we are interested in 0-1 matrices, which fit the setup of network inference
problems. For Bernoulli 0-1 matrices, we prove that with high probability the unique solution property
holds for all $k$-sparse nonnegative vectors where $k$ is $O(n)$, instead of the previous result $O(\sqrt{n})$. For the
adjacency matrix of a general expander, the same phenomenon exists and we further provide a closed-form
constant ratio of $k$ to $n$. One future direction is to obtain the uniqueness property threshold for a given
measurement matrix.

For the matrix case, we develop a necessary and sufficient condition for a linear compressed operator to
admit a unique feasible positive semidefinite matrix solution. We further show that this condition will be
satisfied with overwhelmingly high probability for a randomly generated Gaussian linear compressed
operator, using approaches vastly different from those used in the vector case. Explicitly computing the
threshold $\xi$ as a function of $\alpha$ for the uniqueness property to hold is left for future work.